Chapter 8

  1. An open path has variation in all the variables along the path. A closed path has at least one variable with no variation along it (i.e. you have controlled for at least one variable along it).

2a. List every path from X to Y:

X–> A–> Y

X–> C <– D –> Y

X<–B–>Y

X–> C <– D –> B –> Y

X<–B<–D–>Y
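
The enumeration in 2a can be cross-checked mechanically. What follows is a minimal Python sketch, assuming the diagram contains exactly the edges implied by the paths listed above. The `EDGES` set and the helper names `neighbors`, `all_paths`, and `show` are illustrative and do not come from the textbook. The sketch walks the diagram while ignoring arrow direction, then prints each path with its arrows.

```python
# Directed edges (cause, effect), inferred from the paths listed in 2a.
# This edge list is an assumption; the original diagram is not shown.
EDGES = {("X", "A"), ("A", "Y"), ("X", "C"), ("D", "C"),
         ("D", "Y"), ("B", "X"), ("B", "Y"), ("D", "B")}


def neighbors(node):
    """Nodes adjacent to `node`, ignoring arrow direction."""
    return sorted({b for a, b in EDGES if a == node} |
                  {a for a, b in EDGES if b == node})


def all_paths(start, end, visited=None):
    """Yield every path from start to end that never revisits a node."""
    visited = (visited or []) + [start]
    if start == end:
        yield visited
        return
    for nxt in neighbors(start):
        if nxt not in visited:
            yield from all_paths(nxt, end, visited)


def show(path):
    """Render a path with arrows that match the edge directions."""
    out = path[0]
    for a, b in zip(path, path[1:]):
        out += (" -> " if (a, b) in EDGES else " <- ") + b
    return out


paths = list(all_paths("X", "Y"))
for p in paths:
    print(show(p))
```

Under the assumed edges this yields five paths, the same count as the list above.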

2b. Front door paths: X–> A–> Y

2c. Open back door paths:

X<–B–>Y

X<–B<–D–>Y

The remaining two paths, X–> C <– D –> Y and X–> C <– D –> B –> Y, are closed by default because C is a collider on both.

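2d. Given the paths listed in 2a, only B must be controlled for to identify the effect of X on Y: controlling for B closes both X<–B–>Y and X<–B<–D–>Y. D may also be controlled for, but it is not required. C should not be controlled for because it is a collider, and controlling for it could open the paths that run through it.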
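
The collider reasoning in 2c and 2d can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a full d-separation check: the edges are the ones implied by the paths in 2a, the names `is_collider` and `is_open` are hypothetical, and the rule ignores descendants of colliders (which is harmless here because C has no descendants in the assumed diagram). A path counts as open when no non-collider on it is controlled and every collider on it is controlled.

```python
# Directed edges (cause, effect), inferred from the paths listed in 2a.
EDGES = {("X", "A"), ("A", "Y"), ("X", "C"), ("D", "C"),
         ("D", "Y"), ("B", "X"), ("B", "Y"), ("D", "B")}

# The four non-front-door paths from 2a, written as node sequences.
BACK_DOOR_PATHS = [
    ["X", "B", "Y"],
    ["X", "B", "D", "Y"],
    ["X", "C", "D", "Y"],
    ["X", "C", "D", "B", "Y"],
]


def is_collider(prev, node, nxt):
    """True when both neighbours on the path point into `node`."""
    return (prev, node) in EDGES and (nxt, node) in EDGES


def is_open(path, controlled=frozenset()):
    """Simplified rule: an uncontrolled collider or a controlled
    non-collider closes the path. Collider descendants are ignored."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = is_collider(prev, node, nxt)
        if collider and node not in controlled:
            return False
        if not collider and node in controlled:
            return False
    return True


for controlled in [set(), {"B"}, {"B", "D"}, {"B", "C"}]:
    still_open = [" ".join(p) for p in BACK_DOOR_PATHS
                  if is_open(p, controlled)]
    print(sorted(controlled), "->", still_open)
```

With these assumed edges, adding C to the control set is what reopens the path through C and D, which is the reason 2d leaves C uncontrolled.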

3a. [Causal diagram; figure not reproduced]

3b. Front door paths:

income–>health

income–>healthcare_qual–>health

3c. Back door paths:

income<–parent_SES–> health

3d. Direct effect paths:

income –> health

3e. Good paths:

income–>health

income–>healthcare_qual–>health

Bad paths:

income<–parent_SES–> health
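
The good/bad split in 3e follows a simple rule: a path is a front door (good) path when every arrow on it points away from the treatment and toward the outcome. Below is a small Python sketch of that rule, assuming the edges implied by the three paths above. The `EDGES` set and the function name `is_front_door` are illustrative.

```python
# Directed edges (cause, effect), inferred from the paths in 3b and 3c.
EDGES = {
    ("income", "health"),
    ("income", "healthcare_qual"),
    ("healthcare_qual", "health"),
    ("parent_SES", "income"),
    ("parent_SES", "health"),
}

# The three paths from income to health, written as node sequences.
PATHS = [
    ["income", "health"],
    ["income", "healthcare_qual", "health"],
    ["income", "parent_SES", "health"],
]


def is_front_door(path):
    """True when every step on the path follows an arrow forward."""
    return all((a, b) in EDGES for a, b in zip(path, path[1:]))


good_paths = [p for p in PATHS if is_front_door(p)]
bad_paths = [p for p in PATHS if not is_front_door(p)]

print("Good:", good_paths)
print("Bad:", bad_paths)
```

The path through parent_SES fails the check at its first step, because the arrow runs from parent_SES into income rather than out of income.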

4. Front door path (c)

5a. Popularity is the outcome, or effect, variable

5b. If you controlled for popularity, you would be unable to assess the impact of the treatment variables (teachingquality, numberofpubs) on the outcome (popularity), because you would no longer be allowing it to vary. By controlling for popularity, you “close” the paths leading to it.

6a. All paths from lockdown to recession:

lockdown –> recession

lockdown–> unemployment–> recession

lockdown<–prioreconomy–>unemployment–>recession

lockdown<–prioreconomy–>recession

lockdown–> unemployment<–prioreconomy–>recession

6b. Front door paths:

lockdown –> recession

lockdown–> unemployment–> recession

6c. If we controlled for unemployment, we would close the path lockdown–> unemployment–> recession and isolate the direct effect of lockdown on recession (assuming the back door paths through prioreconomy are also closed). We would no longer capture the part of the effect that lockdown has on recession through the mechanism of unemployment.
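
A small simulation can illustrate the point in 6c. This is a sketch under loud assumptions: the coefficients (0.5, 0.3, 0.8) are made up, all relationships are linear, and prioreconomy is left out entirely, as if its back doors were already closed. The helpers `slope` and `residuals` are hypothetical names for a plain one-variable least-squares fit. Controlling for unemployment is done by removing its linear contribution from both lockdown and recession before fitting.

```python
import random

random.seed(0)
N = 20_000


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """The part of y left over after a linear fit on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Made-up data generating process, for illustration only.
lockdown = [random.gauss(0, 1) for _ in range(N)]
unemployment = [0.5 * l + random.gauss(0, 1) for l in lockdown]
recession = [0.3 * l + 0.8 * u + random.gauss(0, 1)
             for l, u in zip(lockdown, unemployment)]

# No controls: direct plus indirect effect (0.3 + 0.5 * 0.8 = 0.7).
total_effect = slope(lockdown, recession)

# Controlling for unemployment: only the direct effect (0.3) remains.
direct_effect = slope(residuals(unemployment, lockdown),
                      residuals(unemployment, recession))

print(round(total_effect, 2), round(direct_effect, 2))
```

In this setup the uncontrolled estimate should land near 0.7 and the controlled estimate near 0.3, so the gap between them is the part of the effect that runs through unemployment.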

6d. It would be difficult to measure PriorEconomy. We would first have to define the time window for “prioreconomy,” so that it is explicit which period counts as the “prior” economic state and which as the “current” one. There are also many measures of the state of the economy, including GDP, GNP, and stock market performance, so one metric would have to be chosen and justified. Recessions might also be hard to measure quantitatively: we would have to decide how long a period of economic decline must last before it counts as a recession.

6e. Additional variables that might be relevant include healthcare system quality (which would inform how long the lockdown lasts) and national debt (which would inform how much the government can afford to offer in stimulus checks, affect the state of the prior economy, and affect the likelihood of a lockdown):

lockdown <–nat_debt–> prioreconomy–> recession

lockdown <–nat_debt–> prioreconomy–> stimulus–> recession

lockdown <–nat_debt–> prioreconomy–> stimulus–> unemployment–> recession

7. Familial_SES, work ethic, and networking connections could confound the relationship between higher education and income. A couple of examples of bad paths in a causal diagram for this question might be:

higher_ed<–familial_SES–> income

higher_ed<–work_ethic–> income

higher_ed<–familial_SES–>networking_connections–>income

Chapter 9

  1. Natural experiment (B)

2a. We can partition the variation in treatment in order to isolate the components that don’t have any open back door paths. Then, we focus on the part of the variation of treatment that does not have a back door path, and disregard the sources of variation in treatment that do have back door paths.
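
The partitioning idea in 2a can be illustrated with a simulation. This is a sketch under made-up assumptions, not a definitive method: `z` stands in for a source of exogenous variation, `w` for an unobserved confounder, and the true effect of treatment `x` on outcome `y` is set to 2. The sketch keeps only the part of `x` that is predicted by `z` and discards the rest, which still carries the back door through `w`.

```python
import random

random.seed(1)
N = 20_000


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Made-up data generating process, for illustration only.
z = [random.gauss(0, 1) for _ in range(N)]   # exogenous source
w = [random.gauss(0, 1) for _ in range(N)]   # unobserved confounder
x = [zi + wi + random.gauss(0, 1) for zi, wi in zip(z, w)]
y = [2 * xi + 3 * wi + random.gauss(0, 1) for xi, wi in zip(x, w)]

# Using all the variation in x leaves the back door x <- w -> y open.
naive = slope(x, y)

# Keep only the variation in x that z explains, then relate y to it.
b = slope(z, x)
a = mean(x) - b * mean(z)
x_from_z = [a + b * zi for zi in z]
isolated = slope(x_from_z, y)

print(round(naive, 2), round(isolated, 2))
```

In this setup the naive estimate should land near 3, while the estimate based only on the variation driven by `z` should land near the true value of 2.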

2b. In a randomized controlled experiment, we look for random assignment of treatment and control groups.

3. Five major differences between randomized and natural experiments are as follows. In natural experiments: 1) there will occasionally be back doors between the “natural randomness” and the outcome, because treatment is assigned by nature rather than by the researcher, which complicates our ability to isolate the source of variation in treatment we’re interested in (however, if we can control for those variables to shut the back door paths down, then we can proceed); 2) people might not even realize they are in an experiment (a higher degree of experimental realism), so the reactions you get may be more realistic; 3) we see only the effect on people who are sensitive to the natural treatment (if the effect would be different for a different group of people, we won’t know, because we don’t control who receives treatment), whereas in a purely randomized experiment we could administer treatment across a wide variety of people and thus isolate the effect of the treatment on each group; 4) people are less inclined to believe in the exogeneity of the randomization, since it is hard to convince them that treatment is assigned sufficiently randomly when the researcher doesn’t control the assignment; and 5) sample sizes tend to be bigger, and the sample isn’t self-selected from people willing to volunteer for experiments, so it may be more representative of the population.

4. Does providing unemployed or precariously employed individuals with 4,000 dollar quarterly stimulus checks stimulate economic participation and consequently boost economic growth? This question is infeasible to answer with a randomized experiment because of 1) its cost (distributing 16,000 dollars a year to each of many people would make for an expensive study) and 2) its scope. Assessing an economy-wide impact would be difficult in a purely experimental setting and could be pursued more feasibly in a natural experiment. Furthermore, the “realistic” effects of such a major change on the economy could be assessed more accurately through a natural experiment.

5. Exogenous variation refers to variation that is not caused by any part of the data generating process. Exogenous variation can be researcher controlled or ‘naturally’ determined. In natural experiments, ideally, exogenous variation derives from an “external source” that is close enough to random that we can treat it as random and assume that it delivers an essentially random (or quasi-random) assignment of treatment.

6a. Causal diagram (exogenous variation: COVID; treatment: social isolation/quarantine; outcome: physical health) [figure not reproduced]

6b. Paths from source of exogenous variation to outcome:

COVID–> social_isolation–> mental_health–>physical_health

COVID–> mental_health–>physical_health

6c. The following paths need to be closed:

COVID–> mental_health–>physical_health

6d. I don’t think that COVID is in fact ‘randomly’ distributed: whether or not you contract it is a product of many factors, including occupation, SES, zip code, and social behavior. These variables might therefore undermine the quasi-randomness of the treatment assignment. Furthermore, general activity level/socialization level could affect both one’s likelihood of contracting COVID and one’s mental and physical health. This is a possibly neglected variable.

7. Because the exogenous variation has no back doors, nothing it predicts can have back doors either (B)

8a. US foreign relations with countries other than China could be a possible confounding variable. For instance, if the US is engaging harmfully with another country, then this could affect Brazilians’ opinion of the US independent of whether or not they are affected by tariffs. Additionally, if the US’s relations with this other country were severed as a result of this behavior, and it could no longer trade with that country, then this might affect whether the US could afford to impose tariffs on imports from China. This back door path would be as follows:

tariffs_on_China<–US_foreign_relations–>opinion_of_US

8b. I would not necessarily believe the result, due to the open backdoor path mentioned above. The US’s imposition of tariffs and external opinions of the US could both be affected by the state of US foreign relations.